Based on Stein's method, we derive upper bounds for Poisson process approximation in the
$L_1$-Wasserstein metric $d_2^{(p)}$, which is based on a slightly adapted $L_p$-Wasserstein metric between
point measures. For the case $p = 1$, this construction yields the metric $d_2$ introduced in [Barbour
and Brown, Stochastic Process. Appl. 43 (1992) 9-31], for which Poisson process approximation
is well studied in the literature. We demonstrate the usefulness of the extension to general $p$ by
showing that $d_2^{(p)}$-bounds control differences between expectations of certain $p$th order average
statistics of point processes. To illustrate the bounds obtained for Poisson process approximation,
we consider the structure of 2-runs and the hard core model as concrete examples.

1. Introduction

Stein's method is a powerful and flexible tool for deriving upper bounds for distances
between probability distributions. Since its first publication in Stein (1972), where it was
limited to normal approximation, the method has been extensively studied and adapted
to a wide range of different distributions; see Barbour and Chen (2005) for a comprehensive
overview. In Barbour and Brown (1992) (see also Barbour, Holst and Janson
(1992) for discrete state spaces and the earlier results in Arratia, Goldstein and Gordon
(1989) and Barbour (1988)) Poisson process approximation by Stein's method was developed
both in the total variation metric and in a particular Wasserstein metric, denoted
by $d_2$, that has proved to be more suitable for the problem of point process approximation.
In Brown and Xia (2001) (after an earlier more complicated version in
Brown, Weinberg and Xia (2000)) a partial improvement of the $d_2$-approximation was
offered that was able to remove in many cases a rather annoying logarithmic factor from
the upper bound. For a fine overview of Stein's method for Poisson process approximation
see Xia (2005).
In the present paper we use Stein's method to give upper bounds for Poisson process
approximation in a generalized $d_2$-metric, which we denote by $d_2^{(p)}$, where $p \in [1, \infty]$ and
$d_2^{(1)} = d_2$. This generalization enables us to draw wider conclusions from the resulting
estimates. In particular, we have that any upper bound obtained for a $d_2^{(p)}$-distance
controls also the difference between the expectations of statistics that are based on the
$p$th order average of certain distance features within the point processes, whereas often
the same is true only for the standard (first-order) average in the case of $d_2$-bounds. The
price to be paid for this improvement is that the upper bounds we obtain are in general
somewhat worse. However, for $p < \infty$ they are still better than the corresponding total
variation estimate, and they do not contain the infamous logarithmic factor that usually
appears in the estimates for $p = 1$.

The paper is organized as follows. In Section 2 we give the definition of $d_2^{(p)}$ and discuss
some of the elementary properties (Section 2.1). We furthermore present examples of
the pth order average statistics mentioned above (Section 2.2). Section 3 contains our 
main result. After stating the general upper bound for Poisson process approximation in 
Section 3.1, we compute two examples in concrete situations (Section 3.2), before proving 
the bound in Section 3.3. 

2. The Wasserstein metrics 
2.1. Notation and definitions 

We always consider a compact metric space $(\mathcal{X}, d_0)$ with $d_0 \le 1$ as the state space of our
point processes and equip it with its Borel $\sigma$-algebra $\mathcal{B}$. Denote the space of all finite point
measures on $\mathcal{X}$ by $\mathfrak{N}$ and equip it as usual with the vague topology and the $\sigma$-algebra
$\mathcal{N}$ generated by this topology, which is the smallest $\sigma$-algebra that renders the point
counts of measurable sets measurable (see Kallenberg (1986), Section 1.1, Lemma 4.1,
and Section 15.7). Recall that a point process is just a random element of $\mathfrak{N}$.

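We first define metrics $d_1^{(p)}$ on $\mathfrak{N}$ that are based on an $L_p$-Wasserstein construction.
Denote the set of permutations of $\{1,2,\ldots,n\}$ by $\Pi_n$. For any $\xi = \sum_{i=1}^{m} \delta_{x_i}$ and
$\eta = \sum_{i=1}^{n} \delta_{y_i} \in \mathfrak{N}$, let

\[
d_1^{(p)}(\xi,\eta) :=
\begin{cases}
1 & \text{if } |\xi| \ne |\eta|, \\
\Bigl( \frac{1}{n} \min_{\pi \in \Pi_n} \sum_{i=1}^{n} d_0(x_i, y_{\pi(i)})^p \Bigr)^{1/p} & \text{if } |\xi| = |\eta| = n \ge 1, \\
0 & \text{if } |\xi| = |\eta| = 0
\end{cases}
\]

for $1 \le p < \infty$, and let

\[
d_1^{(\infty)}(\xi,\eta) :=
\begin{cases}
1 & \text{if } |\xi| \ne |\eta|, \\
\min_{\pi \in \Pi_n} \max_{1 \le i \le n} d_0(x_i, y_{\pi(i)}) & \text{if } |\xi| = |\eta| = n \ge 1, \\
0 & \text{if } |\xi| = |\eta| = 0.
\end{cases}
\]

It is straightforward (in fact immediate, except for the triangle inequality, which can be
proved by Minkowski's inequality) that $d_1^{(p)}$, $1 \le p \le \infty$, are metrics and bounded by 1.
By applying Lyapunov's inequality, we obtain that

\[
d_1^{(p)} \le d_1^{(q)} \quad \text{for } 1 \le p \le q \le \infty, \tag{2.1}
\]

and with the help of this result it can be seen that $d_1^{(p)}$ metrizes the vague topology for
any $p$. Furthermore it can be shown that $(\mathfrak{N}, d_1^{(p)})$ is complete and separable (the latter
follows directly from Result 15.7.7 in Kallenberg (1986)).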
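
The definition of $d_1^{(p)}$ can be made concrete with a small brute-force sketch. It enumerates all permutations, so it is only meant for very small point patterns; the function names `d1p` and `d0_line` and the choice of the truncated absolute difference on the real line as $d_0$ are illustrative assumptions, not part of the paper.

```python
from itertools import permutations


def d0_line(x, y):
    """Illustrative base metric on the real line, truncated so that d0 <= 1."""
    return min(abs(x - y), 1.0)


def d1p(xs, ys, p, d0=d0_line):
    """Brute-force d_1^(p) between two finite point patterns given as lists.

    Returns 1 if the numbers of points differ, 0 if both patterns are empty,
    and otherwise the minimum over all pairings of the pth order average
    (or, for p = inf, the maximum) of the paired d0-distances.
    """
    if len(xs) != len(ys):
        return 1.0
    n = len(xs)
    if n == 0:
        return 0.0
    best = None
    for perm in permutations(range(n)):
        dists = [d0(xs[i], ys[perm[i]]) for i in range(n)]
        if p == float("inf"):
            cost = max(dists)
        else:
            cost = (sum(d ** p for d in dists) / n) ** (1.0 / p)
        if best is None or cost < best:
            best = cost
    return best


xs, ys = [0.1, 0.5], [0.5, 0.2]
values = [d1p(xs, ys, p) for p in (1, 2, float("inf"))]
```

For this pair of patterns the values are nondecreasing in $p$, in line with inequality (2.1).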

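Next we define the metric $d_2^{(p)}$ on the space $\mathcal{P}(\mathfrak{N})$ of probability measures on $\mathfrak{N}$, which
is the usual $L_1$-Wasserstein metric with respect to $d_1^{(p)}$. Let $\mathcal{F}_2^{(p)} := \{ f : \mathfrak{N} \to [0,1] ;\,
|f(\xi) - f(\eta)| \le d_1^{(p)}(\xi,\eta) \text{ for all } \xi, \eta \in \mathfrak{N} \}$. Set then for any $P, Q \in \mathcal{P}(\mathfrak{N})$

\[
d_2^{(p)}(P,Q) := \sup_{f \in \mathcal{F}_2^{(p)}} \Bigl| \int_{\mathfrak{N}} f \, dP - \int_{\mathfrak{N}} f \, dQ \Bigr| .
\]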

Since this is exactly the Wasserstein construction (the fact that we restrict the functions
in $\mathcal{F}_2^{(p)}$ to be $[0,1]$-valued has no influence on the supremum because the underlying
$d_1^{(p)}$-metric is bounded by 1), it is clear that $d_2^{(p)}$, $1 \le p \le \infty$, are metrics, obviously bounded
by 1, and that general results about Wasserstein metrics apply. One such result is the
well-known Kantorovich-Rubinstein theorem, which in our situation states that

\[
d_2^{(p)}(P,Q) = \min_{\Xi \sim P,\ \mathrm{H} \sim Q} E\, d_1^{(p)}(\Xi, \mathrm{H})
\]

for any $P, Q \in \mathcal{P}(\mathfrak{N})$, where we use notation of the form $Z \sim R$ to indicate that a random
element $Z$ has distribution $R$. Furthermore it is clear by inequality (2.1) that

\[
d_2^{(p)} \le d_2^{(q)} \quad \text{for } 1 \le p \le q \le \infty, \tag{2.2}
\]

and it follows, by the facts that $d_1^{(p)}$ metrizes the vague topology and that $d_2^{(p)}$ is also
the bounded Wasserstein metric, that $d_2^{(p)}$ metrizes convergence in distribution of point
processes (see Dudley (1989), Theorem 11.3.3).

To the author's knowledge, $d_2^{(p)}$ has not been considered before as a metric on $\mathcal{P}(\mathfrak{N})$,
except for p = 1 (as mentioned in the Introduction) and for p = oo (in Xia (1994) and 
Schuhmacher (2005a)). 

2.2. Applications of distance estimates 

By the definition of $d_2^{(p)}$, an upper bound of $d_2^{(p)}(\mathcal{L}(\Xi), \mathcal{L}(\mathrm{H}))$ controls also the difference
$|Ef(\Xi) - Ef(\mathrm{H})|$ for any function $f \in \mathcal{F}_2^{(p)}$. It is thus of considerable interest, in order to
apply the upper bounds obtained in Theorem 3.A, to have a certain supply of "meaningful"
$d_1^{(p)}$-Lipschitz continuous statistics of point patterns (where we do not worry too
much about the Lipschitz constant as it will only appear as an additional factor in the
upper bound). One way in which such statistics can then be used is to test if a given
point pattern is a realization from among a certain class of point process distributions
that are all known to lie within some $d_2^{(p)}$-distance $\varepsilon$ of a Poisson process distribution
(e.g., according to our example in Section 3.2.2, the class of hard core processes with
fixed intensity $\lambda$ and hard core radius $r$ below some level $\varrho > 0$). The fact that the test
statistic lies in $\mathcal{F}_2^{(p)}$ enables us to control the size of the test in such a way that it lies
only slightly below some required level $\alpha$ if $\varepsilon$ is small. A detailed application of this idea
in the case $p = 1$ was presented in Schuhmacher (2005b), Section 3.2.

The examples of $d_1^{(p)}$-Lipschitz continuous statistics $f$ given below are all $p$th order
averages of certain distance features within the point measure. In each case we tacitly set
$f$ to zero where the stated definition does not apply (e.g. for $n < l$ in Proposition 2.A).
The proofs of the propositions are given in the Appendix.

Our first example concerns $p$th order U-statistics with Lipschitz continuous kernels.
Note that at least for $p = 1$ there is a plethora of results available about U-statistics that
are based on a fixed number of i.i.d. points (which in the point process framework corresponds
to a Poisson process conditioned on its total number of points). See Lee (1990)
for more information. For $p = 1$, a class of functions similar to those in Proposition 2.A
was proposed in Barbour, Holst and Janson (1992), Section 10.2.

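Proposition 2.A. Take $l \in \mathbb{N}$ and let $K : \mathbb{Z}_+ \times \mathcal{X}^l \to [0,1]$ be a function that is symmetric
in the last $l$ arguments and satisfies

\[
|K(m; u_1, u_2, \ldots, u_l) - K(m; v_1, v_2, \ldots, v_l)| \le \frac{1}{l} \sum_{i=1}^{l} d_0(u_i, v_i) \tag{2.3}
\]

for all $m \in \mathbb{Z}_+$ and all $u_1, u_2, \ldots, u_l, v_1, v_2, \ldots, v_l \in \mathcal{X}$. Define $f : \mathfrak{N} \to [0,1]$ by

\[
f(\xi) = \overline{K}^{(p)}(\xi) := \Bigl( \binom{n}{l}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_l \le n} K(n; x_{i_1}, x_{i_2}, \ldots, x_{i_l})^p \Bigr)^{1/p} \tag{2.4}
\]

for $\xi = \sum_{i=1}^{n} \delta_{x_i} \in \mathfrak{N}$ with $n \ge l$, and $1 \le p < \infty$. Then $f \in \mathcal{F}_2^{(p)}$.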
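
As an illustration of (2.4), the following sketch evaluates a $p$th order U-statistic for points on the real line, with the kernel taken to be half the (truncated) interpoint distance for $l = 2$. The names `half_distance` and `u_statistic` and the one-dimensional setting are assumptions made for the example only.

```python
from itertools import combinations
from math import comb


def half_distance(m, u, v):
    """Kernel for l = 2: half the interpoint distance, with d0 = |u - v| capped at 1."""
    return 0.5 * min(abs(u - v), 1.0)


def u_statistic(points, p, kernel=half_distance, l=2):
    """pth order average of kernel values over all groups of l points.

    Follows the convention of the text: the statistic is set to zero
    if the pattern has fewer than l points.
    """
    n = len(points)
    if n < l:
        return 0.0
    total = sum(kernel(n, *group) ** p for group in combinations(points, l))
    return (total / comb(n, l)) ** (1.0 / p)


pattern = [0.0, 0.5, 1.0]
first_order = u_statistic(pattern, 1)
second_order = u_statistic(pattern, 2)
```

For the three-point pattern above the kernel values are 0.25, 0.5 and 0.25, so the first-order average is 1/3 and the second-order average is the square root of 0.125.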

Instead of (2.4), we may also consider the centered $p$th order average, which for the
case $p = 2$ gives us the standard deviation of $(K(n; x_{i_1}, x_{i_2}, \ldots, x_{i_l}))_{1 \le i_1 < i_2 < \cdots < i_l \le n}$.

Proposition 2.B. Let $K$ be as in Proposition 2.A and $\overline{K} := \overline{K}^{(1)}$. Define $f : \mathfrak{N} \to [0,1]$
by

\[
f(\xi) := \Bigl( \binom{n}{l}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_l \le n} \bigl| K(n; x_{i_1}, x_{i_2}, \ldots, x_{i_l}) - \overline{K}(\xi) \bigr|^p \Bigr)^{1/p} \tag{2.5}
\]

for $\xi = \sum_{i=1}^{n} \delta_{x_i} \in \mathfrak{N}$ with $n \ge l$, and $1 \le p < \infty$. Then $f$ is $d_1^{(p)}$-Lipschitz continuous with
constant 2.

One basic choice for the function $K$ in the above results is half the interpoint distance,
that is, $K(m; u_1, u_2) = K_0(u_1, u_2) := \frac{1}{2} d_0(u_1, u_2)$ for all $m \in \mathbb{N}$ and $u_1, u_2 \in \mathcal{X}$. By the
triangle inequality for $d_0$ it is immediately seen that inequality (2.3) holds. This choice
allows several extensions to functions $K$ that are based on more than two points. One is
half the average interpoint distance in groups of size $l \ge 2$, that is,

\[
K_1(u_1, \ldots, u_l) := \frac{1}{2} \binom{l}{2}^{-1} \sum_{1 \le i < j \le l} d_0(u_i, u_j).
\]

Note that this function is only of interest for $p > 1$, since for $p = 1$ and any $l \ge 2$ we just
obtain the same values $f(\xi)$ as under $K_0$ for any $\xi$ that has at least $l$ points. Let $\mathcal{X} \subset \mathbb{R}^D$,
where for the sake of simplicity we assume that $\mathrm{diam}(\mathcal{X}) := \max\{|x - y| ; x, y \in \mathcal{X}\} \le 1$,
and set $d_0(x,y) := |x - y|$. Then two more extensions are given as $2/l$ times the radius of
the minimal bounding ball and $l/(2(l-1))$ times the average distance to the geometrical
centroid (center of gravity) in groups of size $l$; that is, for $l \ge 2$ and $u_1, \ldots, u_l \in \mathcal{X}$,

\[
K_2(u_1, \ldots, u_l) := \frac{2}{l} \min\{ r \ge 0 \mid \exists x \in \mathbb{R}^D \text{ such that } u_1, \ldots, u_l \in \mathbb{B}(x,r) \},
\]

where $\mathbb{B}(x,r)$ denotes the closed Euclidean ball with center at $x$ and radius $r$, and

\[
K_3(u_1, \ldots, u_l) := \frac{1}{2(l-1)} \sum_{i=1}^{l} \Bigl| u_i - \frac{1}{l} \sum_{j=1}^{l} u_j \Bigr| .
\]

For all of these functions $K_t$, $t \in \{1,2,3\}$, inequality (2.3) is straightforward to check by
showing that

\[
|K_t(u, u_2, \ldots, u_l) - K_t(v, u_2, \ldots, u_l)| \le \frac{1}{l} d_0(u,v)
\]

for all $u, v, u_2, \ldots, u_l \in \mathcal{X}$ and using the symmetry of $K_t$. More examples, some of which
also have corresponding extensions to groups of size $l$, can be found in Schuhmacher
(2005a).

Another $d_1^{(p)}$-Lipschitz continuous function is the $p$th order average of the nearest
neighbor distances in a finite point measure on $\mathbb{R}^D$, where $D \in \mathbb{N}$. This statistic gives
important information about the amount of clustering in a point pattern.

Proposition 2.C. Let $\mathcal{X} \subset \mathbb{R}^D$ and $d_0(x,y) := |x - y| \wedge 1$ for all $x, y \in \mathcal{X}$. Define the
function $f : \mathfrak{N} \to [0,1]$ by

\[
f(\xi) := \Bigl( \frac{1}{n} \sum_{i=1}^{n} \min_{j \ne i} d_0(x_i, x_j)^p \Bigr)^{1/p}
\]

for $\xi = \sum_{i=1}^{n} \delta_{x_i} \in \mathfrak{N}$ with $n \ge 2$, and $1 \le p < \infty$. Then $f$ is $d_1^{(p)}$-Lipschitz continuous
with constant $\tau_D + 1$ for $p = 1$ and $2(2\tau_D + 1)^{1/p}$ for general $p$, where $\tau_D$ denotes the
kissing number in $D$ dimensions (i.e., the maximal number of unit balls that can touch
a unit ball in $(\mathbb{R}^D, |\cdot|)$ without producing any overlaps of the interiors; see Conway and
Sloane (1999), Section 1.2, for details).
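
The nearest neighbor statistic described above can be sketched as follows. The sketch assumes that the statistic is the $p$th order average of the distances $\min_{j \ne i} d_0(x_i, x_j)$ with $d_0(x,y) = |x - y| \wedge 1$, as the verbal description suggests; the function name `nn_statistic` and the representation of points as coordinate tuples are illustrative choices.

```python
from math import dist


def nn_statistic(points, p):
    """pth order average of truncated nearest neighbor distances.

    `points` is a list of coordinate tuples in R^D. The statistic is set to
    zero for patterns with fewer than two points, following the convention
    of the text.
    """
    n = len(points)
    if n < 2:
        return 0.0
    nearest = []
    for i, x in enumerate(points):
        # Truncated Euclidean distance to the closest other point.
        nearest.append(min(min(dist(x, y), 1.0)
                           for j, y in enumerate(points) if j != i))
    return (sum(d ** p for d in nearest) / n) ** (1.0 / p)


pattern = [(0.0, 0.0), (0.0, 0.3), (0.4, 0.0)]
average_nn = nn_statistic(pattern, 1)
```

In this small pattern the nearest neighbor distances are 0.3, 0.3 and 0.4, so small values of the statistic indicate tightly clustered points.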

3. Distance bounds 

In this subsection the main theorem is stated. We give an upper bound for $p \in [1, \infty]$ of the
$d_2^{(p)}$-distance between the distribution of a general point process $\Xi$ and a Poisson process
with the same expectation measure. The result is a generalization of Theorem 5.19 in
Xia (2005) (the case $p = 1$), which in turn is ultimately based on Theorems 3.6 and 3.7 in
Barbour and Brown (1992) (but incorporates among other things certain improvements
made in Brown and Xia (1995a) and Chen and Xia (2004)).

3.1. Statement of the main theorem 

We always consider a point process $\Xi$ on $\mathcal{X}$ that has finite expectation measure $\boldsymbol{\lambda}$. Let
$\Xi_x$ be the Palm process of $\Xi$ given a point in $x$ (i.e. any point process that is distributed
according to the Palm distribution of $\Xi$ given a point in $x$); see Kallenberg (1986),
Chapter 10, for formal details or Xia (2005), Section 2.3.1, for a concise overview. Write
$\lambda := |\boldsymbol{\lambda}|$ for the total mass of $\boldsymbol{\lambda}$, and denote by $\mathrm{Po}(\boldsymbol{\lambda})$ the Poisson process distribution
with expectation measure $\boldsymbol{\lambda}$, and by $\mathrm{Po}(\lambda)$ the Poisson distribution with expectation $\lambda$.

Call a family $\{N_x\}_{x \in \mathcal{X}}$ of measurable subsets $N_x$ of $\mathcal{X}$ a neighborhood structure if
$x \in N_x$ and the mapping $[\mathfrak{N} \times \mathcal{X} \to \mathfrak{N}, (\xi, x) \mapsto \xi|_{N_x}]$ is $(\mathcal{N} \otimes \mathcal{B})$-$\mathcal{N}$-measurable. This is
the case if $N(\mathcal{X}) := \{(x,y) \in \mathcal{X}^2 ; y \in N_x\}$ is $\mathcal{B}^2$-measurable (see Chen and Xia (2004),
after Formula (2.4)). Note that $N_x$ does not have to be a neighborhood of $x$ in the
topological sense.

If $\mu$ is a finite measure on $\mathcal{X}$, then we say that the density conditions are satisfied
for $\Xi$ (with respect to the reference measure $\mu$) if $\Xi$ is a simple point process, and the
Janossy densities $j_n : \mathcal{X}^n \to \mathbb{R}_+$ with respect to $\mu^n$ exist for $n \ge 0$ and are hereditary
(i.e., $j_n(x_1, \ldots, x_n) = 0$ implies $j_{n+1}(x_1, \ldots, x_n, x_{n+1}) = 0$ for all $x_1, \ldots, x_n, x_{n+1} \in \mathcal{X}$).
In this case, it can be seen that a density $\phi : \mathcal{X} \to \mathbb{R}_+$ of the expectation measure $\boldsymbol{\lambda}$
with respect to $\mu$ exists. Write furthermore $g(x; \xi)$ for the conditional density of having
a point of $\Xi$ in $x$ given that $\Xi|_{N_x^c} = \xi$. See Xia (2005), Section 2.3.2, for details on
Janossy densities and the definition of $g$, and see Schuhmacher (2008), Section 2.4 and
Remark A.C, for the reason why hereditarity (or a similar property) is needed, as well as
for an alternative approach using densities with respect to a Poisson process distribution
rather than Janossy densities.

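Define the metric $d_1'$ on $\mathfrak{N}$ by $d_1'(\xi,\eta) := (m - n) + \min_{\pi \in \Pi_m} \sum_{i=1}^{n} d_0(x_i, y_{\pi(i)})$ for
$\xi = \sum_{i=1}^{n} \delta_{x_i}$ and $\eta = \sum_{i=1}^{m} \delta_{y_i}$ if $n \le m$, and $d_1'(\xi,\eta) := d_1'(\eta,\xi)$ otherwise. Let
$\kappa_0 := 4e/(1 + 4e - \sqrt{1 + 8e}) \approx 1.53$, $\kappa_1 := 2 - \kappa_0$,

\[
\gamma_1^{(p)} :=
\begin{cases}
\sqrt{\frac{2}{e}} + 2 (\kappa_0 e^{\kappa_0})^{-1/2} \le 1.61 & \text{if } p = 1, \\[1ex]
\sqrt{\frac{2}{e}} + 2 \Bigl( \frac{2-p}{p \kappa_0^{1-1/p} - \kappa_1 (p-1)} \Bigr)^{(2-p)/(2(p-1))} & \text{if } 1 < p \le 2, \\[1ex]
\frac{p}{p-1} + \frac{1}{\sqrt{2e}} \, \frac{p}{p-1} \Bigl( \frac{p-2}{\sqrt{2e}\,(p \kappa_0^{1-1/p} - \kappa_1 (p-1))} \Bigr)^{(p-2)/p} & \text{if } 2 < p < \infty,
\end{cases}
\]

(for $p = 2$ the power is read as $0^0 := 1$, so that $\gamma_1^{(2)} = 2 + \sqrt{2/e}$), and

\[
\gamma_2^{(p)} := \frac{(1 + 2^{1/p} + (2/3)^{1/p})\, p^2}{(p-1)(2p-1)} \quad \text{for } 1 < p < \infty.
\]

To get an impression of the behavior of $\gamma_1^{(p)}$ and $\gamma_2^{(p)}$ as functions of $p$, see Figure 1.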
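
The constants can be evaluated numerically with the sketch below. It assumes the forms $\kappa_1 = 2 - \kappa_0$, $\gamma_1^{(1)} = \sqrt{2/e} + 2(\kappa_0 e^{\kappa_0})^{-1/2}$ and the case split at $p = 2$ as written here, which are consistent with the values 1.53 and 1.61 quoted in the text and with the identity $\kappa_0^{-1} + (2e\kappa_0)^{-1/2} = 1$ used in the proof; the function names are illustrative.

```python
from math import e, exp, sqrt

KAPPA0 = 4 * e / (1 + 4 * e - sqrt(1 + 8 * e))  # approximately 1.53
KAPPA1 = 2 - KAPPA0


def gamma1(p):
    """Constant gamma_1^(p) for 1 <= p < infinity (assumed closed forms)."""
    if p == 1:
        return sqrt(2 / e) + 2 * (KAPPA0 * exp(KAPPA0)) ** -0.5
    denom = p * KAPPA0 ** (1 - 1 / p) - KAPPA1 * (p - 1)
    if p <= 2:
        # At p = 2 Python evaluates 0.0 ** 0.0 as 1.0, matching the convention 0^0 := 1.
        return sqrt(2 / e) + 2 * ((2 - p) / denom) ** ((2 - p) / (2 * (p - 1)))
    inner = ((p - 2) / (sqrt(2 * e) * denom)) ** ((p - 2) / p)
    return p / (p - 1) * (1 + inner / sqrt(2 * e))


def gamma2(p):
    """Constant gamma_2^(p) for 1 < p < infinity."""
    beta = 1 + 2 ** (1 / p) + (2 / 3) ** (1 / p)
    return beta * p * p / ((p - 1) * (2 * p - 1))


table = [(p, gamma1(p), gamma2(p)) for p in (1.5, 2, 3, 10)]
```

Evaluating `gamma2` for values of $p$ close to 1 shows the blow-up that is discussed in Remark 3.B.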



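Theorem 3.A. For any point process $\Xi$ on $\mathcal{X}$ with expectation measure $\boldsymbol{\lambda}$ and any
neighborhood structure $\{N_x\}_{x \in \mathcal{X}}$, we have

\[
d_2^{(p)}(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda}))
\le c_2^{(p)}(\lambda) \Bigl( \int_{\mathcal{X}} \boldsymbol{\lambda}(N_x) \, \boldsymbol{\lambda}(dx) + E \int_{\mathcal{X}} (\Xi(N_x) - 1) \, \Xi(dx) \Bigr) + \min(\varepsilon_1, \varepsilon_2),
\]

where

\[
\varepsilon_1 = c_1^{(p)}(\lambda) \, E \int_{\mathcal{X}} \bigl| g(x; \Xi|_{N_x^c}) - \phi(x) \bigr| \, \mu(dx),
\]

which is valid if the density conditions are satisfied for $\Xi$ with respect to $\mu$, and

\[
\varepsilon_2 = c_2^{(p)}(\lambda) \, E \int_{\mathcal{X}} d_1'(\Xi|_{N_x^c}, \Xi_x|_{N_x^c}) \, \boldsymbol{\lambda}(dx).
\]

The factors $c_1^{(p)}(\lambda)$ and $c_2^{(p)}(\lambda)$ are given by

\[
c_1^{(p)}(\lambda) =
\begin{cases}
\min(1, \gamma_1^{(p)} \lambda^{-1/\max(2,p)}) & \text{if } 1 \le p < \infty, \\
1 & \text{if } p = \infty,
\end{cases}
\]

and

\[
c_2^{(p)}(\lambda) =
\begin{cases}
\min\bigl(1, \frac{11}{6} [1 + 2 \log^+(6\lambda/11)] \lambda^{-1}\bigr) & \text{if } p = 1, \\
\min(1, \gamma_2^{(p)} \lambda^{-1/p}) & \text{if } 1 < p < \infty, \\
1 & \text{if } p = \infty.
\end{cases}
\]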

Remark 3.B. Note that $\gamma_2^{(p)} \to \infty$ for $p \to 1$, which is consistent with the fact that
$c_2^{(1)}(\lambda)$ is not of the form "constant times $\lambda^{-1}$" but contains an extra factor of order
$\log(\lambda)$. The presence of this factor in the upper bound of the $d_2$-distance has caused
much discussion over the years, especially since no such factor is present in the corresponding
upper bound of the total variation distance between the distributions of the
total numbers of points (see Barbour and Brown (1992), Theorem 3.10).

It was shown in Brown and Xia (1995b) that with the current proof technique this
factor cannot be omitted in a general setting (more precisely, that the estimate in Proposition
3.H(ii) is of the correct order if $p = 1$). In Brown et al. (2000) and Brown and Xia
(2001) non-uniform bounds of the term $\Delta^2 h$ in Proposition 3.H(ii) were given, with the
help of which the authors were able to dispose of the logarithmic factor in many important
special cases. However, there is currently no general result available that can
do without the logarithm. Very recently, Röllin (2008) gave an example of a point process
$\Xi$, for which the (exact) order of $d_2(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda}))$ for $\lambda \to \infty$ contains an extra
factor $\log(\lambda)$ as compared to the order of $d_{TV}(\mathcal{L}(|\Xi|), \mathrm{Po}(\lambda))$. This example makes the
logarithmic term in $c_2^{(1)}(\lambda)$ appear rather natural.



3.2. Examples 

In order to illustrate how the bounds given in Theorem 3.A can be used in concrete
situations, we present two quick examples.

3.2.1. Process of 2-runs 

This application has been considered for p = 1 in Section 6.2 of Xia (2005). The corre- 
sponding arguments remain largely the same. 

Let $\mathcal{X} = [0,1]$, $d_0 \le 1$ an arbitrary metric on $\mathcal{X}$, and choose $0 \le z_1 < z_2 < \cdots < z_n = 1$.
Consider i.i.d. indicator random variables $I_1, I_2, \ldots, I_n$ with expectation $p$. In order to
avoid edge effects, we interpret the indices $1, 2, \ldots, n$ as the elements of the quotient
ring $\mathbb{Z}_n := \mathbb{Z}/n\mathbb{Z}$ (so that $n + 1 = 1$ and $1 - 1 = n$). Define indicators $J_i := I_i I_{i+1}$ for
$i \in \mathbb{Z}_n$. Then $\Xi := \sum_{i \in \mathbb{Z}_n} J_i \delta_{z_i}$ is a point process on $\mathcal{X}$ with expectation measure $\boldsymbol{\lambda} =
\sum_{i \in \mathbb{Z}_n} p^2 \delta_{z_i}$, which describes the starting points of 2-runs in the process $\sum_{i \in \mathbb{Z}_n} I_i \delta_{z_i}$.
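
The construction of the 2-runs indicators can be sketched in a few lines. The function names and the Monte Carlo check of the total mass $\lambda = np^2$ are illustrative additions; the cyclic indexing mirrors the identification of the indices with $\mathbb{Z}_n$.

```python
import random


def two_runs(indicators):
    """Return J_i = I_i * I_{i+1} with cyclic indices (index n + 1 wraps to 1)."""
    n = len(indicators)
    return [indicators[i] * indicators[(i + 1) % n] for i in range(n)]


def simulate_mean_count(n, p, reps, seed=0):
    """Monte Carlo estimate of the expected number of 2-run starting points."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        indicators = [1 if rng.random() < p else 0 for _ in range(n)]
        total += sum(two_runs(indicators))
    return total / reps


example = two_runs([1, 1, 0, 1])          # the last index pairs with the first
estimate = simulate_mean_count(50, 0.3, 4000)
```

With $n = 50$ and $p = 0.3$ the estimate should be close to $np^2 = 4.5$.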

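Applying Theorem 3.A is straightforward. Setting $N_{z_i} := \{z_i\}$, we can immediately see
that

\[
\int_{\mathcal{X}} \boldsymbol{\lambda}(N_x) \, \boldsymbol{\lambda}(dx) = n p^4 \quad \text{and} \quad E \int_{\mathcal{X}} (\Xi(N_x) - 1) \, \Xi(dx) = 0.
\]

We give an upper bound for the term $\varepsilon_2$. As a concrete Palm process we may choose

\[
\Xi_{z_i} = \delta_{z_i} + I_{i-1} \delta_{z_{i-1}} + I_{i+2} \delta_{z_{i+1}} + \sum_{j \in \mathbb{Z}_n \setminus \{i-1, i, i+1\}} J_j \delta_{z_j}.
\]

For bounding $d_1'(\Xi|_{N_{z_i}^c}, \Xi_{z_i}|_{N_{z_i}^c})$, pair each point of $\Xi|_{N_{z_i}^c}$ with the corresponding point
of $\Xi_{z_i}|_{N_{z_i}^c}$ at the same position, which gives a perfect match except at $z_{i-1}$ and $z_{i+1}$,
where it can happen that $\Xi_{z_i}|_{N_{z_i}^c}$ has a point, but $\Xi|_{N_{z_i}^c}$ has none. Thus

\[
d_1'(\Xi|_{N_{z_i}^c}, \Xi_{z_i}|_{N_{z_i}^c}) \le I_{i-1} - J_{i-1} + I_{i+2} - J_{i+1} = I_{i-1}(1 - I_i) + (1 - I_{i+1}) I_{i+2},
\]

which implies that

\[
\varepsilon_2 \le c_2^{(p)}(\lambda) \, 2 n p^3 (1 - p).
\]

Collecting the various estimates, we obtain the following result.

Proposition 3.C. With the above assumptions we have

\[
d_2^{(p)}(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda})) \le
\begin{cases}
\frac{11}{6} [1 + 2 \log^+(6 n p^2 / 11)] \cdot p(2 - p) & \text{if } p = 1, \\
\gamma_2^{(p)} (n p^2)^{1 - 1/p} \cdot p(2 - p) & \text{if } 1 < p < \infty, \\
n p^2 \cdot p(2 - p) & \text{if } p = \infty.
\end{cases}
\]

Here the $p$ in the case distinctions and in the exponents is the order of the metric, whereas the $p$ in
$np^2$ and $p(2-p)$ is the expectation of the indicators.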
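
A minimal sketch of how the bound for the 2-runs process might be evaluated numerically. To avoid the clash of symbols, the success probability of the indicators is called `prob` and the order of the metric `order`; both names, the function names, and the factor 11/6 in the case of order 1 (taken here from the corresponding $d_2$ bound in the literature) are assumptions of this sketch.

```python
from math import inf, log


def gamma2(order):
    """Constant gamma_2 for metric order strictly between 1 and infinity."""
    beta = 1 + 2 ** (1 / order) + (2 / 3) ** (1 / order)
    return beta * order ** 2 / ((order - 1) * (2 * order - 1))


def two_runs_bound(n, prob, order):
    """Upper bound for the 2-runs process with n sites and success probability prob."""
    lam = n * prob ** 2                # total mass of the expectation measure
    tail = prob * (2 - prob)
    if order == 1:
        log_plus = max(log(6 * lam / 11), 0.0)
        return 11 / 6 * (1 + 2 * log_plus) * tail
    if order == inf:
        return lam * tail
    return gamma2(order) * lam ** (1 - 1 / order) * tail


bounds = {order: two_runs_bound(1000, 0.01, order) for order in (1, 2, inf)}
```

Since every $d_2^{(p)}$-distance is bounded by 1, values of the bound above 1 carry no information.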

Remark 3.D. In Theorem 6.3 of Xia (2005) it is shown that the logarithmic factor 
for p = 1 can be disposed of at the cost of a higher constant and a considerably more 
complicated proof. 

Remark 3.E. The maybe more obvious choice of $N_{z_i} := \{z_{i-1}, z_i, z_{i+1}\}$ for the proof of
Proposition 3.C, which implies that $\varepsilon_1 = \varepsilon_2 = 0$ in Theorem 3.A, would in fact yield a
somewhat worse bound, where the factor $p(2 - p)$ is replaced by $p(2 + 3p)$.

3.2.2. Hard core process 

This application has been considered for $p = 1$ in Barbour and Brown (1992) (see after
Theorems 2.4 and 3.6), with an important correction in Brown and Greig (1994). The
arguments below are largely the same as in the latter article.

Let $\mathcal{X} = [0,1]^D$ and $d_0 \le 1$ an arbitrary metric on $\mathcal{X}$. In order to avoid edge effects, we
shall assume that the torus convention holds, which will become important below when
we measure Euclidean distances $|x - y|$. Let $\mu$ be Lebesgue measure on $\mathcal{X}$, and consider
a stationary hard core process $\Xi$ with expectation measure $\boldsymbol{\lambda} = \lambda \mu$ for $\lambda > 0$ and with
hard core radius $r > 0$ (note that $r$ cannot be above a certain threshold $r_0(\lambda) > 0$ that is
determined by $\lambda$). Such a process may be specified by its Janossy densities with respect
to $\mu$, given by

\[
j_n(x_1, \ldots, x_n) = c \beta^n \, 1[\,|x_i - x_j| > r \text{ for all } 1 \le i < j \le n\,],
\]

where $c$ and $\beta$ are chosen in such a way that $\sum_{n=0}^{\infty} \int_{\mathcal{X}^n} (n!)^{-1} j_n(\mathbf{x}) \, \mu^n(d\mathbf{x}) = 1$ (correct
normalization for $j_n$ to be Janossy densities) and $\sum_{n=0}^{\infty} \int_{\mathcal{X}^n} (n!)^{-1} j_{n+1}(x, \mathbf{y}) \, \mu^n(d\mathbf{y}) = \lambda$
for every $x \in \mathcal{X}$ (correct density of expectation measure, $\phi(x) = \lambda$).

We can easily see that the density conditions are satisfied for $\Xi$, and we can thus apply
Theorem 3.A and make use of the term $\varepsilon_1$. Setting $N_x := \{x\}$, it is immediately clear
that the first two summands in the upper bound are zero. A short computation (see
Brown and Greig (1994), Section 3) shows that $g(x; \xi) = \beta \, 1[\xi(\mathbb{B}(x,r)) = 0]$, where $\mathbb{B}(x,r)$
is the closed Euclidean ball with center at $x$ and radius $r$, and that $P[\Xi(\mathbb{B}(x,r)) = 0] =
P[\Xi|_{N_x^c}(\mathbb{B}(x,r)) = 0] = \lambda/\beta$. By these two equations it can be easily seen that

\[
E \int_{\mathcal{X}} \bigl| g(x; \Xi|_{N_x^c}) - \phi(x) \bigr| \, \mu(dx) = 2 \lambda \Bigl( 1 - \frac{\lambda}{\beta} \Bigr) \le 2 \lambda^2 \alpha_D r^D,
\]

where $\alpha_D$ denotes the volume of $\mathbb{B}(0,1)$. Thus Theorem 3.A yields the following result.

Proposition 3.F. With the above assumptions we have

\[
d_2^{(p)}(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda})) \le
\begin{cases}
2 \gamma_1^{(p)} \lambda^{2 - 1/\max(2,p)} \alpha_D r^D & \text{if } 1 \le p < \infty, \\
2 \lambda^2 \alpha_D r^D & \text{if } p = \infty.
\end{cases}
\]

Remark 3.G. Following the arguments in Section 4 of Brown and Greig (1994), it can
be seen that the constant 2 in Proposition 3.F can be improved to 1.5 by choosing $N_x :=
\mathbb{B}(x, r/2)$, at the cost of an additional condition and a considerably more complicated
proof.

3.3. Proof of Theorem 3.A

Stein's method for Poisson process approximation as originally developed in
Barbour and Brown (1992) provides us with a general procedure for finding upper bounds
for a distance term of the form $d(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda})) = \sup_{f \in \mathcal{F}} |Ef(\Xi) - \mathrm{Po}(\boldsymbol{\lambda})(f)|$ for some
class $\mathcal{F}$ of measurable functions $f : \mathfrak{N} \to \mathbb{R}$.

The rough idea of this procedure is as follows. First, set up the so-called Stein equation
as

\[
f(\xi) - \mathrm{Po}(\boldsymbol{\lambda})(f) = \mathcal{A} h(\xi) \quad \text{for } \xi \in \mathfrak{N}, \tag{3.1}
\]

where $\mathcal{A}$ is given by

\[
\mathcal{A} h(\xi) := \int_{\mathcal{X}} [h(\xi + \delta_x) - h(\xi)] \, \boldsymbol{\lambda}(dx) + \int_{\mathcal{X}} [h(\xi - \delta_x) - h(\xi)] \, \xi(dx)
\]

for suitable functions $h : \mathfrak{N} \to \mathbb{R}$ and for $\xi \in \mathfrak{N}$. Thus $\mathcal{A}$ is the generator of the spatial
immigration-death process with immigration measure $\boldsymbol{\lambda}$ and unit per capita death rate,
for which $\mathrm{Po}(\boldsymbol{\lambda})$ plays the special role of being its stationary distribution (see Xia (2005),
Section 3.2 for more information). Let $\mathbf{Z}_\xi$ be such an immigration-death process with
starting configuration $\mathbf{Z}_\xi(0) = \xi \in \mathfrak{N}$. It can be shown that, if $f$ is bounded, the function
$h = h_f : \mathfrak{N} \to \mathbb{R}$,

\[
h(\xi) = h_f(\xi) := - \int_0^\infty [E f(\mathbf{Z}_\xi(t)) - \mathrm{Po}(\boldsymbol{\lambda})(f)] \, dt \tag{3.2}
\]

is well-defined and solves equation (3.1). Rather than bounding $|Ef(\Xi) - \mathrm{Po}(\boldsymbol{\lambda})(f)|$ directly,
it is then the key idea of Stein's method to bound the equivalent term $|E \mathcal{A} h(\Xi)|$,
which in fact turns out to be a considerably easier task in many situations.

In Theorem 5.3 of Xia (2005), which is a (very slight) specialization of Theorem 2.3
in Chen and Xia (2004), this strategy is employed to give a very versatile but still somewhat
raw upper bound, which incorporates the essence of several of the earlier results
mentioned in the introduction. Note that we have interchanged $f$ and $h$ in our presentation,
which results in notation that is more commonly used in the literature (see,
e.g., Barbour, Holst and Janson (1992), Barbour and Brown (1992), or Brown and Xia
(1995a)). A direct consequence of Theorem 5.3 is that, for any bounded measurable
function $f : \mathfrak{N} \to \mathbb{R}_+$ and $h = h_f$ defined as in (3.2), we have

\[
|Ef(\Xi) - \mathrm{Po}(\boldsymbol{\lambda})(f)|
\le \|\Delta^2 h\|_\infty \Bigl( \int_{\mathcal{X}} \boldsymbol{\lambda}(N_x) \, \boldsymbol{\lambda}(dx) + E \int_{\mathcal{X}} (\Xi(N_x) - 1) \, \Xi(dx) \Bigr) + \min(\varepsilon_1(h), \varepsilon_2(h)), \tag{3.3}
\]

where

\[
\varepsilon_1(h) = \|\Delta h\|_\infty \, E \int_{\mathcal{X}} \bigl| g(x; \Xi|_{N_x^c}) - \phi(x) \bigr| \, \mu(dx),
\]

which is valid if the density conditions are satisfied for $\Xi$ with respect to $\mu$, and

\[
\varepsilon_2(h) = E \int_{\mathcal{X}} \bigl| [h(\Xi|_{N_x^c} + \delta_x) - h(\Xi|_{N_x^c})] - [h(\Xi_x|_{N_x^c} + \delta_x) - h(\Xi_x|_{N_x^c})] \bigr| \, \boldsymbol{\lambda}(dx).
\]

Here, the supremum norms of the first and second differences of $h$ are defined as

\[
\|\Delta h\|_\infty := \sup_{\xi \in \mathfrak{N},\, x \in \mathcal{X}} |h(\xi + \delta_x) - h(\xi)|
\]

and

\[
\|\Delta^2 h\|_\infty := \sup_{\xi \in \mathfrak{N},\, x, y \in \mathcal{X}} |h(\xi + \delta_x + \delta_y) - h(\xi + \delta_x) - h(\xi + \delta_y) + h(\xi)|.
\]

Note that the above result does not make use of any particular metric $d$, since it does
not restrict the choice of functions $f$ to a specific class $\mathcal{F}$. The refinement of the result by
giving upper bounds on the various increments of $h = h_f$ according to special properties
of $f$ is the crucial step in adapting Stein's method to any one particular metric and is
typically quite complicated. This step for the metrics $d_2^{(p)}$ is made in Proposition 3.H
below. Inequality (3.3) together with this proposition directly yields the statement of
Theorem 3.A.

Proposition 3.H. Let $p \in [1, \infty]$. If $f \in \mathcal{F}_2^{(p)}$, then

(i) $\|\Delta h\|_\infty \le c_1^{(p)}(\lambda)$;

(ii) $\|\Delta^2 h\|_\infty \le c_2^{(p)}(\lambda)$;

(iii) $|(h(\xi + \delta_x) - h(\xi)) - (h(\eta + \delta_x) - h(\eta))| \le c_2^{(p)}(\lambda) \, d_1'(\xi, \eta)$ for $\xi, \eta \in \mathfrak{N}$ and $x \in \mathcal{X}$.

Proof. The proof builds on the ideas of the proofs of the corresponding results for the
case $p = 1$; see Propositions 5.16 to 5.18 in Xia (2005). In particular, it makes use of
the representation of the spatial immigration-death process as $\mathbf{Z}_\xi(t) = \mathbf{D}_\xi(t) + \mathbf{Z}_0(t)$,
where $\mathbf{D}_\xi$ is a spatial pure death process with unit per capita death rate and starting
configuration $\xi$, $\mathbf{Z}_0$ is a spatial immigration-death process with the same parameters
as $\mathbf{Z}_\xi$, but starting with the 0-measure, and $\mathbf{D}_\xi$ and $\mathbf{Z}_0$ are independent (see Xia (2005),
Proposition 3.5). Write $Z_\xi(t) := |\mathbf{Z}_\xi(t)|$, $Z_0(t) := |\mathbf{Z}_0(t)|$, and note that $Z_0(t) \sim \mathrm{Po}(\lambda_t)$,
where $\lambda_t = \lambda(1 - e^{-t})$.
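
The statement about the total number of points of the immigration-death process started from the empty configuration can be checked by simulation. The sketch below tracks only the point count, which is itself a birth-death chain with constant birth rate `lam` and unit per capita death rate; the function name and the Gillespie-type simulation scheme are illustrative choices.

```python
import random
from math import exp


def simulate_count(lam, t_end, rng, start=0):
    """Number of points at time t_end of an immigration-death process.

    Points immigrate at total rate lam (assumed positive) and each existing
    point dies independently at rate 1. Only the count is tracked.
    """
    t, count = 0.0, start
    while True:
        rate = lam + count
        t += rng.expovariate(rate)        # waiting time until the next event
        if t > t_end:
            return count
        if rng.random() < lam / rate:     # the event is an immigration
            count += 1
        else:                             # the event is a death
            count -= 1


rng = random.Random(1)
lam, t_end, reps = 3.0, 0.7, 20000
mean_count = sum(simulate_count(lam, t_end, rng) for _ in range(reps)) / reps
expected = lam * (1 - exp(-t_end))
```

The empirical mean `mean_count` should be close to `expected`, the mean of the Poisson distribution with parameter $\lambda_t$.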

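Statement (i). Suppose that $1 < p < \infty$. Inequality (5.19) in Xia (2005) yields that

\[
|h(\xi + \delta_x) - h(\xi)| \le \int_0^\infty e^{-t} \bigl\{ 1 \wedge \bigl[ |E(f(\mathbf{Z}_\xi(t) + \delta_x) - f(\mathbf{Z}_\xi(t) + \delta_U))|
+ |E(f(\mathbf{Z}_\xi(t) + \delta_U) - f(\mathbf{Z}_\xi(t)))| \bigr] \bigr\} \, dt \tag{3.4}
\]

for a random element $U \sim \boldsymbol{\lambda}/\lambda$ of $\mathcal{X}$ that is independent of everything else, where

\[
|E(f(\mathbf{Z}_\xi(t) + \delta_x) - f(\mathbf{Z}_\xi(t) + \delta_U))|
\le E\Bigl( \Bigl( \frac{1}{Z_\xi(t) + 1} \Bigr)^{1/p} \Bigr)
\le E\Bigl( \Bigl( \frac{1}{Z_0(t) + 1} \Bigr)^{1/p} \Bigr)
\le \Bigl( E \frac{1}{Z_0(t) + 1} \Bigr)^{1/p}
= \Bigl( \frac{1 - e^{-\lambda_t}}{\lambda_t} \Bigr)^{1/p} \tag{3.5}
\]

(see inequality (3.11) for details on the first estimate), and

\[
|E(f(\mathbf{Z}_\xi(t) + \delta_U) - f(\mathbf{Z}_\xi(t)))| \le \frac{1}{\sqrt{2 e \lambda_t}}
\]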
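by inequality (5.23) in Xia (2005) (note that "=" in the last line should be "$\le$"). Hence,

\[
\begin{aligned}
|h(\xi + \delta_x) - h(\xi)|
&\le \int_0^\infty e^{-t} \Bigl\{ 1 \wedge \Bigl[ \Bigl( \frac{1 - e^{-\lambda_t}}{\lambda_t} \Bigr)^{1/p} + \frac{1}{\sqrt{2 e \lambda_t}} \Bigr] \Bigr\} \, dt \\
&\le \int_0^1 \Bigl\{ 1 \wedge \Bigl[ \Bigl( \frac{1}{\lambda s} \Bigr)^{1/p} + \frac{1}{\sqrt{2 e \lambda s}} \Bigr] \Bigr\} \, ds \\
&\le \frac{\kappa_0}{\lambda} + \int_{\kappa_0/\lambda}^1 \Bigl[ \Bigl( \frac{1}{\lambda s} \Bigr)^{1/p} + \frac{1}{\sqrt{2 e \lambda s}} \Bigr] \, ds \\
&= \frac{\kappa_0}{\lambda} + \frac{p}{p-1} \Bigl( \frac{1}{\lambda} \Bigr)^{1/p} \Bigl[ 1 - \Bigl( \frac{\kappa_0}{\lambda} \Bigr)^{1 - 1/p} \Bigr] + \sqrt{\frac{2}{e \lambda}} \Bigl[ 1 - \Bigl( \frac{\kappa_0}{\lambda} \Bigr)^{1/2} \Bigr]
\end{aligned} \tag{3.6}
\]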

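for $\lambda \ge \kappa_0$, where $\kappa_0$ was defined such that it satisfies $\kappa_0^{-1} + (2 e \kappa_0)^{-1/2} = 1$. Write $\kappa(p) :=
\frac{p}{p-1} \kappa_0^{1 - 1/p} - \kappa_1$, which can be easily seen to be strictly decreasing in $p$ with limit $2(\kappa_0 -
1) > 0$ for $p \to \infty$. By the defining identity of $\kappa_0$, the right-hand side of (3.6) equals
$\frac{p}{p-1} \lambda^{-1/p} + \sqrt{2/e} \, \lambda^{-1/2} - \kappa(p) \lambda^{-1}$. For $1 < p \le 2$, we factor out $\lambda^{-1/2}$, and maximize the left-over term

\[
\frac{p}{p-1} \Bigl( \frac{1}{\lambda} \Bigr)^{1/p - 1/2} + \sqrt{\frac{2}{e}} - \kappa(p) \Bigl( \frac{1}{\lambda} \Bigr)^{1/2} \tag{3.7}
\]

in $\lambda$. For $1 < p < 2$, taking the first and second derivatives shows that a global maximum
is attained at $\lambda = \bigl( (p-1)\kappa(p)/(2-p) \bigr)^{p/(p-1)}$, where the term takes the value

\[
\sqrt{\frac{2}{e}} + \frac{p}{p-1} \Bigl( \frac{2-p}{(p-1)\kappa(p)} \Bigr)^{(2-p)/(2(p-1))} - \kappa(p) \Bigl( \frac{2-p}{(p-1)\kappa(p)} \Bigr)^{p/(2(p-1))} = \gamma_1^{(p)}.
\]

For $p = 2$, the term (3.7) is obviously strictly increasing in $\lambda$, so that letting $\lambda \to \infty$
yields that $\gamma_1^{(p)}$ is an upper bound for this term also in the case $p = 2$. Thus, by inequality (3.6),

\[
\|\Delta h\|_\infty \le \gamma_1^{(p)} \lambda^{-1/2} \quad \text{for } 1 < p \le 2.
\]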
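For $p > 2$, we factor out $\lambda^{-1/p}$ in inequality (3.6), and maximize the left-over term

\[
\frac{p}{p-1} + \sqrt{\frac{2}{e}} \Bigl( \frac{1}{\lambda} \Bigr)^{1/2 - 1/p} - \kappa(p) \Bigl( \frac{1}{\lambda} \Bigr)^{1 - 1/p}
\]

in $\lambda$. Taking the first and second derivatives shows that a global maximum is attained at

\[
\lambda = 2e \Bigl( \frac{(p-1)\kappa(p)}{p-2} \Bigr)^2,
\]

where the term takes the value $\gamma_1^{(p)}$.
Thus, by inequality (3.6), $\|\Delta h\|_\infty \le \gamma_1^{(p)} \lambda^{-1/p}$ for $p > 2$.

In total we have shown, for $1 < p < \infty$, that $\|\Delta h\|_\infty \le \gamma_1^{(p)} \lambda^{-1/\max(2,p)}$ if $\lambda \ge \kappa_0$. By
equation (3.4) we have $\|\Delta h\|_\infty \le \int_0^\infty e^{-t} \, dt = 1$ for any $\lambda$. Statement (i) is then obtained
because $\gamma_1^{(p)} \ge \kappa_0^{1/\max(2,p)} \ge \lambda^{1/\max(2,p)}$ if $\lambda \le \kappa_0$, which follows for $p > 2$ simply by
$\frac{p}{p-1} \ge \kappa_0^{1/p}$ and for $1 < p \le 2$ by using the alternative expression via $x$ from (3.8), the
inequality $(1 + y)^r \le \exp(ry)$ for $r, y \ge 0$, and that $(x + 2)\kappa_0^{1/(x+2)} - \kappa_1 - x$ is maximal
at $x = 0$.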

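What remains to be shown are the cases $p = 1$ and $p = \infty$. Since $\|\Delta h\|_\infty \le 1$ holds
always, the statement for $p = \infty$ is clear. For $p = 1$, we make use of the fact that $f \in \mathcal{F}_2^{(1)}$
implies $f \in \mathcal{F}_2^{(p)}$ and thus $|h(\xi + \delta_x) - h(\xi)| \le c_1^{(p)}(\lambda)$ holds for every $p > 1$. Letting $p \to 1$
yields the required upper bound, where $\gamma_1^{(p)} \to \gamma_1^{(1)}$ follows by substituting $x := (2-p)/(p-1)$, so
that

\[
\Bigl( \frac{2-p}{(p-1)\kappa(p)} \Bigr)^{-(2-p)/(p-1)} = \Bigl( 1 + \frac{(x+2)\kappa_0^{1/(x+2)} - \kappa_1 - x}{x} \Bigr)^{x}
\longrightarrow \exp(2 - \kappa_1 + \log(\kappa_0)) = \kappa_0 e^{\kappa_0} \quad \text{as } x \to \infty. \tag{3.8}
\]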

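Statement (ii). Suppose that $1 < p < \infty$. As in the first part of the proof of Proposition
5.17 in Xia (2005), we obtain that

\[
h(\xi + \delta_x + \delta_y) - h(\xi + \delta_x) - h(\xi + \delta_y) + h(\xi)
= - \int_0^\infty e^{-2t} E\bigl[ f(\mathbf{Z}_\xi(t) + \delta_x + \delta_y) - f(\mathbf{Z}_\xi(t) + \delta_x)
- f(\mathbf{Z}_\xi(t) + \delta_y) + f(\mathbf{Z}_\xi(t)) \bigr] \, dt, \tag{3.9}
\]

where there are numbers $b_k(t) \in [0,1]$ for $k \ge -1$ such that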

E[/(Z e (t) + 4 + <5,) - /(Z £ (t) + 5.) - /(Z c (t) + <5,) + /(Z c (t))] 

*<(^)>(WH 

00 

+ ^ 6 fc (t)(P[2b(t) = fc-l]-2P[Z (t) = A;]+P[Zo(t) = fc + l]). 
fc=-i 

The only difference between (3.10) and the corresponding inequality on page 155 of Xia
(2005) lies in the exponents $1/p$. They stem from a straightforward adaptation of inequalities
(5.24) and (5.26) in Xia (2005) (note that "=" in the last line of (5.26) should be
"$\le$"), which is obtained by employing the estimate

\[
|f(\eta + \delta_x) - f(\eta + \delta_y)| \le d_1^{(p)}(\eta + \delta_x, \eta + \delta_y) \le \Bigl( \frac{1}{|\eta| + 1} \Bigr)^{1/p} d_0(x,y) \le \Bigl( \frac{1}{|\eta| + 1} \Bigr)^{1/p} \tag{3.11}
\]

for $\eta \in \mathfrak{N}$. Continuing from equation (3.10), we have

\[
\sum_{k=-1}^{\infty} b_k(t) \bigl( P[Z_0(t) = k - 1] - 2 P[Z_0(t) = k] + P[Z_0(t) = k + 1] \bigr) \le \frac{1}{\lambda_t} \tag{3.12}
\]

as shown on page 155 in Xia (2005), and



E 



%|(i)+2 



1//' 



■ E 



I[%ft)>l] 



E 



1 



i/p 



(3.13) 



< 



2 x /p 4. (2/3) Vp 



k i/p 



where the first inequality is obtained because the sequence ((j-^) 1 ^ + (^p-) 1 ' 35 )*^ is 
seen to be decreasing, and the second inequality follows from (3.5). 

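In total, we combine (3.9), (3.10), (3.12), and (3.13), replacing $f$ by $(1 - f) \in \mathcal{F}_2^{(p)}$ in
(3.10) if necessary, to obtain

\[
\begin{aligned}
|h(\xi + \delta_x + \delta_y) &- h(\xi + \delta_x) - h(\xi + \delta_y) + h(\xi)| \\
&\le \int_0^\infty e^{-2t} \Bigl\{ 2 \wedge \Bigl[ \bigl( 2^{1/p} + (2/3)^{1/p} \bigr) \Bigl( \frac{1}{\lambda_t} \Bigr)^{1/p} + \frac{1}{\lambda_t} \Bigr] \Bigr\} \, dt \\
&= \int_0^1 (1 - s) \Bigl\{ 2 \wedge \Bigl[ \bigl( 2^{1/p} + (2/3)^{1/p} \bigr) \Bigl( \frac{1}{\lambda s} \Bigr)^{1/p} + \frac{1}{\lambda s} \Bigr] \Bigr\} \, ds \\
&\le \int_0^{\kappa_2(p)/\lambda} 2 (1 - s) \, ds + \int_{\kappa_2(p)/\lambda}^1 (1 - s) \beta(p) \Bigl( \frac{1}{\lambda s} \Bigr)^{1/p} \, ds \\
&= 2 \frac{\kappa_2(p)}{\lambda} - \Bigl( \frac{\kappa_2(p)}{\lambda} \Bigr)^2
+ \beta(p) \Bigl( \frac{1}{\lambda} \Bigr)^{1/p} \Bigl[ \frac{p^2}{(p-1)(2p-1)} - \frac{p}{p-1} \Bigl( \frac{\kappa_2(p)}{\lambda} \Bigr)^{1 - 1/p} + \frac{p}{2p-1} \Bigl( \frac{\kappa_2(p)}{\lambda} \Bigr)^{2 - 1/p} \Bigr]
\end{aligned} \tag{3.14}
\]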



for $\lambda \ge \kappa_2(p)$, where $\kappa_2(p) := (\beta(p)/2)^p$ and $\beta(p) := (1 + 2^{1/p} + (2/3)^{1/p})$. We factor out
$\lambda^{-1/p}$, and find a bound for the left-over term

\[
\frac{\beta(p) p^2}{(p-1)(2p-1)} - 2 \Bigl( \frac{\beta(p)}{2} \Bigr)^{p} \frac{1}{p-1} \Bigl( \frac{1}{\lambda} \Bigr)^{1 - 1/p} + \Bigl( \frac{\beta(p)}{2} \Bigr)^{2p} \frac{1}{2p-1} \Bigl( \frac{1}{\lambda} \Bigr)^{2 - 1/p}
\]

on $\lambda \in [(\beta(p)/2)^p, \infty)$. From the first derivative we can see that this term is strictly
increasing on the whole interval, so that the desired bound is obtained by letting $\lambda$ go to
infinity. Hence $\|\Delta^2 h\|_\infty \le \gamma_2^{(p)} \lambda^{-1/p}$ if $\lambda \ge (\beta(p)/2)^p$ and, by the first inequality in (3.14),
$\|\Delta^2 h\|_\infty \le \int_0^\infty 2 e^{-2t} \, dt = 1$ for any $\lambda$. Noting that $\gamma_2^{(p)} \lambda^{-1/p} \ge 1$ for $\lambda \le (\beta(p)/2)^p$, we
obtain Statement (ii) for $1 < p < \infty$.

The case $p = 1$ was proved as Proposition 5.17 in Xia (2005). Since $\|\Delta^2 h\|_\infty \le 1$ holds
always, the case $p = \infty$ is obvious.
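
The last steps of the argument for Statement (ii) can be checked numerically. The sketch below assumes that the integrand in the second line of (3.14) is bounded by $(1-s)\{2 \wedge \beta(p)(\lambda s)^{-1/p}\}$, integrates this by a midpoint rule, and compares the result with the closed form obtained by splitting the integral at $\kappa_2(p)/\lambda$ and with the final bound $\gamma_2^{(p)} \lambda^{-1/p}$; all function names are illustrative.

```python
def beta(p):
    return 1 + 2 ** (1 / p) + (2 / 3) ** (1 / p)


def gamma2(p):
    return beta(p) * p * p / ((p - 1) * (2 * p - 1))


def kappa2(p):
    return (beta(p) / 2) ** p


def integral_numeric(lam, p, steps=20000):
    """Midpoint rule for the integral over (0, 1) of (1 - s) * min(2, beta * (lam * s)^(-1/p))."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        total += (1 - s) * min(2.0, beta(p) * (lam * s) ** (-1 / p))
    return total * h


def integral_closed(lam, p):
    """Closed form of the same integral, valid for lam >= kappa2(p)."""
    u = kappa2(p) / lam
    bracket = (p * p / ((p - 1) * (2 * p - 1))
               - p / (p - 1) * u ** (1 - 1 / p)
               + p / (2 * p - 1) * u ** (2 - 1 / p))
    return 2 * u - u * u + beta(p) * lam ** (-1 / p) * bracket


checks = [(lam, p, integral_numeric(lam, p), min(1.0, gamma2(p) * lam ** (-1 / p)))
          for lam in (3, 50, 500) for p in (1.5, 2, 4)]
```

In every listed case the numerical value stays below $\min(1, \gamma_2^{(p)} \lambda^{-1/p})$, which is the form of the factor $c_2^{(p)}(\lambda)$ for $1 < p < \infty$.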

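Statement (iii). Suppose that $1 < p < \infty$. We step by step adapt the proof of Proposition
5.18 in Xia (2005). Write $\xi = \sum_{i=1}^{n} \delta_{x_i}$ and $\eta = \sum_{i=1}^{m} \delta_{y_i}$, assuming without loss
of generality that $n \le m$ and that the points of $\xi$ and $\eta$ are numbered according to a
$d_1'$-pairing, that is, such that $(m - n) + \sum_{i=1}^{n} d_0(x_i, y_i) = d_1'(\xi, \eta)$. Let $\eta_j := \sum_{i=1}^{n+j} \delta_{y_i}$ for
$0 \le j \le m - n$. Then

\[
\begin{aligned}
|(h(\xi + \delta_x) - h(\xi)) &- (h(\eta + \delta_x) - h(\eta))| \\
&\le |(h(\xi + \delta_x) - h(\xi)) - (h(\eta_0 + \delta_x) - h(\eta_0))| \\
&\quad + |(h(\eta_0 + \delta_x) - h(\eta_0)) - (h(\eta + \delta_x) - h(\eta))|,
\end{aligned} \tag{3.15}
\]

where the second summand can be estimated as

\[
\begin{aligned}
|(h(\eta_0 + \delta_x) - h(\eta_0)) &- (h(\eta + \delta_x) - h(\eta))| \\
&\le \sum_{j=1}^{m-n} |(h(\eta_{j-1} + \delta_x) - h(\eta_{j-1})) - (h(\eta_j + \delta_x) - h(\eta_j))| \\
&\le \|\Delta^2 h\|_\infty (m - n).
\end{aligned} \tag{3.16}
\]

The first summand in (3.15) is zero if $n = 0$. For $n \ge 1$, write $\xi_j = \sum_{i=1}^{j-1} \delta_{y_i} + \sum_{i=j}^{n} \delta_{x_i}$
for $1 \le j \le n + 1$, so that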

+ 4) - M0) - (M% + s x ) - h( Vo ))\ 

n 

<g( dofe , 9j) ^^e-- E (( 5 -i rrT ) 1/ ')d t ) (3.17) 

n 

<4 rt (A)J>(^,%), 
where the second estimate is obtained from the first inequality in the proof of Lemma 5.15
in Xia (2005) (adjusted by using (3.11) in the last line), the third estimate holds by
inequality (3.5), and the last estimate follows from the proof of Statement (ii) (which
shows that the second line of inequality (3.14) is bounded by $c_2^{(p)}(\lambda)$). Combining
(3.15), (3.16), and (3.17) yields Statement (iii) for $1 < p < \infty$.

The case $p = 1$ was proved as Proposition 5.18 in Xia (2005). The case $p = \infty$ follows by the same proof as above, but bounding the term in the second line of inequality (3.17) by $\sum_{j=1}^{n} \bigl(d_0(x_j, y_j) \int_0^\infty 2e^{-2t}\,dt\bigr) = \sum_{j=1}^{n} d_0(x_j, y_j)$, which is done by using $|f(\eta + \delta_x) - f(\eta + \delta_y)| \le d_0(x, y)$ instead of (3.11) in the proof of Proposition 5.15 in Xia (2005). □



Appendix: Proofs of the Lipschitz continuities in Section 2.2



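Proof of Proposition 2.A. It is obvious that $|f(\xi) - f(\eta)| \le d_1^{(p)}(\xi, \eta)$ is satisfied for $\xi, \eta \in \mathfrak{N}$ with $|\xi| \ne |\eta|$ (since $\operatorname{im}(f) \subset [0,1]$) or with $|\xi| = |\eta| < l$ (since in this case $f(\xi) = f(\eta) = 0$). Suppose then that $\xi = \sum_{i=1}^{n} \delta_{x_i}$ and $\eta = \sum_{i=1}^{n} \delta_{y_i}$, where $n \ge l$ and where the points of $\xi$ and $\eta$ are numbered according to a $d_1^{(p)}$-pairing, that is, such that $\bigl(\frac{1}{n} \sum_{i=1}^{n} d_0(x_i, y_i)^p\bigr)^{1/p} = d_1^{(p)}(\xi, \eta)$. Note that inequality (2.3) together with Lyapunov's inequality implies that

\[
|K(m; u_1, \ldots, u_l) - K(m; v_1, \ldots, v_l)|^p \le \Bigl(\frac{1}{l} \sum_{i=1}^{l} d_0(u_i, v_i)\Bigr)^{p} \le \frac{1}{l} \sum_{i=1}^{l} d_0(u_i, v_i)^p.
\]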



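Using the inverse triangle inequality for $\ell_p$-norms in the first line, we then obtain that

\[
\begin{aligned}
|f(\xi) - f(\eta)|^p &\le \frac{1}{\binom{n}{l}} \sum_{1 \le i_1 < \cdots < i_l \le n} |K(n; x_{i_1}, \ldots, x_{i_l}) - K(n; y_{i_1}, \ldots, y_{i_l})|^p \\
&\le \frac{1}{\binom{n}{l}} \sum_{1 \le i_1 < \cdots < i_l \le n} \frac{1}{l} \sum_{r=1}^{l} d_0(x_{i_r}, y_{i_r})^p \\
&= \frac{1}{n} \sum_{i=1}^{n} d_0(x_i, y_i)^p = \bigl(d_1^{(p)}(\xi, \eta)\bigr)^p.
\end{aligned}
\tag{A.1}
\]

The equality in the last line holds because each index $i$ belongs to exactly $\binom{n-1}{l-1}$ of the index sets and $\binom{n-1}{l-1} \big/ \bigl(l \binom{n}{l}\bigr) = 1/n$. □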
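The counting step that turns the average over all $l$-subsets into a plain average over the $n$ points can be checked directly. The sketch below is illustrative only. The name `subset_average` is hypothetical, and the list `a` stands in for the values $d_0(x_i, y_i)^p$.

```python
from itertools import combinations
from math import comb
import random


def subset_average(a, l):
    """Average, over all l-subsets of indices, of the mean of the selected entries."""
    n = len(a)
    total = sum(sum(a[i] for i in idx) / l
                for idx in combinations(range(n), l))
    return total / comb(n, l)


if __name__ == "__main__":
    random.seed(0)
    a = [random.random() for _ in range(7)]  # stand-in for d_0(x_i, y_i)^p
    print(subset_average(a, 3), sum(a) / len(a))
```

Both printed numbers should agree up to rounding, since every index occurs in the same number of subsets.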

Proof of Proposition 2.B. We show $|f(\xi) - f(\eta)| \le 2 d_1^{(p)}(\xi, \eta)$ in the non-obvious case. Let $\xi = \sum_{i=1}^{n} \delta_{x_i}$ and $\eta = \sum_{i=1}^{n} \delta_{y_i}$, where again $n \ge l$ and the points of $\xi$ and $\eta$ are numbered according to a $d_1^{(p)}$-pairing. Using the inverse triangle inequality for $\ell_p$-norms for the first and the usual triangle inequality for the second relation, we obtain that

\[
\begin{aligned}
|f(\xi) - f(\eta)| &\le \Bigl(\frac{1}{\binom{n}{l}} \sum_{1 \le i_1 < \cdots < i_l \le n} \bigl|\bigl(K(n; x_{i_1}, \ldots, x_{i_l}) - K(n; y_{i_1}, \ldots, y_{i_l})\bigr) - \bigl(K(\xi) - K(\eta)\bigr)\bigr|^p\Bigr)^{1/p} \\
&\le \Bigl(\frac{1}{\binom{n}{l}} \sum_{1 \le i_1 < \cdots < i_l \le n} |K(n; x_{i_1}, \ldots, x_{i_l}) - K(n; y_{i_1}, \ldots, y_{i_l})|^p\Bigr)^{1/p} + |K(\xi) - K(\eta)| \\
&\le 2 d_1^{(p)}(\xi, \eta)
\end{aligned}
\]

by inequality (A.1) (once for general $p$ and once for $p = 1$) and inequality (2.1). □

Proof of Proposition 2.C. Obviously, $|f(\xi) - f(\eta)| \le d_1^{(p)}(\xi, \eta)$ if $|\xi| \ne |\eta|$ or $|\xi| = |\eta| < 2$. Suppose then that $\xi = \sum_{i=1}^{n} \delta_{x_i}$ and $\eta = \sum_{i=1}^{n} \delta_{y_i}$, where $n \ge 2$ and the points of $\xi$ and $\eta$ are numbered according to a $d_1^{(p)}$-pairing. Let $J(i)$ be the index of a nearest neighbor (with respect to $|\cdot|$ and hence $d_0$) of $x_i$ within the points of $\xi$ and $K(i)$ the index of a nearest neighbor of $y_i$ within the points of $\eta$. For $i$ fixed, we have

\[
d_0(x_i, x_{J(i)}) \le d_0(x_i, x_{K(i)}) \le d_0(x_i, y_i) + d_0(y_i, y_{K(i)}) + d_0(y_{K(i)}, x_{K(i)}),
\]

and

\[
d_0(y_i, y_{K(i)}) \le d_0(y_i, y_{J(i)}) \le d_0(y_i, x_i) + d_0(x_i, x_{J(i)}) + d_0(x_{J(i)}, y_{J(i)}),
\]

so that altogether

\[
|d_0(x_i, x_{J(i)}) - d_0(y_i, y_{K(i)})| \le d_0(x_i, y_i) + d_0(x_{L(i)}, y_{L(i)}),
\]

where $L(i) := K(i)$ if $d_0(x_i, x_{J(i)}) > d_0(y_i, y_{K(i)})$ and $L(i) := J(i)$ otherwise. By the inverse triangle inequality for $\ell_p$-norms, we obtain now



\[
\begin{aligned}
|f(\xi) - f(\eta)|^p &\le \frac{1}{n} \sum_{i=1}^{n} |d_0(x_i, x_{J(i)}) - d_0(y_i, y_{K(i)})|^p \\
&\le \frac{1}{n} \sum_{i=1}^{n} \bigl(d_0(x_i, y_i) + d_0(x_{L(i)}, y_{L(i)})\bigr)^p \\
&\le 2^p \Bigl(\frac{1}{n} \sum_{i=1}^{n} d_0(x_i, y_i)^p + \frac{1}{n} \sum_{i=1}^{n} d_0(x_{L(i)}, y_{L(i)})^p\Bigr) \\
&\le 2^p (2\tau_D + 1) \bigl(d_1^{(p)}(\xi, \eta)\bigr)^p,
\end{aligned}
\]

using for the last inequality that any point of a point pattern in $(\mathbb{R}^D, |\cdot|)$ can be nearest neighbor to at most $\tau_D$ other points (see Zeger and Gersho (1994), Theorem 1). The factor $2^p$ is obviously unnecessary if $p = 1$. In Schuhmacher (2005a) a Lipschitz constant of $\tau_D + 1$ was obtained for $p = 1$ by a more complicated proof. □
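The Lipschitz bound for the nearest-neighbor statistic can be spot-checked on small planar patterns. This is a rough sketch under stated assumptions, not the author's construction. It takes $d_0(x, y) = \min(|x - y|, 1)$, reads $f$ off the proof as the $p$th order average nearest-neighbor distance, computes $d_1^{(p)}$ by brute force over all pairings, and uses $\tau_2 = 6$ (the kissing number in the plane). All function names are hypothetical.

```python
import itertools
import math
import random


def d0(x, y):
    """Assumed underlying metric: Euclidean distance truncated at 1."""
    return min(math.dist(x, y), 1.0)


def d1p(xs, ys, p):
    """Brute-force d_1^(p): 1 if the cardinalities differ, else the optimal
    p-th order average matching cost (feasible only for small patterns)."""
    if len(xs) != len(ys):
        return 1.0
    n = len(xs)
    if n == 0:
        return 0.0
    best = min(sum(d0(x, ys[j]) ** p for x, j in zip(xs, perm))
               for perm in itertools.permutations(range(n)))
    return (best / n) ** (1 / p)


def nn_stat(xs, p):
    """p-th order average nearest-neighbor distance; 0 for fewer than 2 points."""
    n = len(xs)
    if n < 2:
        return 0.0
    total = sum(min(d0(x, y) for j, y in enumerate(xs) if j != i) ** p
                for i, x in enumerate(xs))
    return (total / n) ** (1 / p)


def lipschitz_constant(p, tau):
    """Constant 2 * (2*tau + 1)^(1/p) implied by the final display of the proof."""
    return 2.0 * (2 * tau + 1) ** (1 / p)


if __name__ == "__main__":
    random.seed(1)
    p, tau = 2.0, 6  # tau = 6 assumes patterns in the plane
    xs = [(random.random(), random.random()) for _ in range(5)]
    ys = [(random.random(), random.random()) for _ in range(5)]
    print(abs(nn_stat(xs, p) - nn_stat(ys, p)),
          lipschitz_constant(p, tau) * d1p(xs, ys, p))
```

The first printed number should not exceed the second. The brute-force pairing grows factorially in the number of points, so the sketch is only meant for patterns of a handful of points.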